Abstract We have designed and implemented a novel way to process
wide-field astronomical data within a distributed environment of hardware resources
and humanpower. The system is characterized by integration of archiving, cal- 
ibration, and post-calibration analysis of data from raw, through intermediate, 
to final data products. It is a true integration thanks to complete linking of 
data lineage from the final catalogs back to the raw data. This paper describes 
the pipeline processing of optical wide-field astronomical data from the WFI
and OmegaCAM instruments using the Astro-WISE information system (the
Astro-WISE Environment or simply AWE). This information system is an envi- 
ronment of hardware resources and humanpower distributed over Europe. AWE 
is characterized by integration of archiving, data calibration, post-calibration 
analysis, and archiving of raw, intermediate, and final data products. The true 
integration enables a complete data processing cycle from the raw data up to 
the publication of science-ready catalogs. The advantages of this system for 
very large datasets are in the areas of: survey operations management, quality 
control, calibration analyses, and massive processing. 

Keywords wide-field imaging • data processing • information system 



1 Introduction 

The rapid increase in the number of astronomical data sets and even faster 
increase of overall data volume demands a new paradigm for the scientific 
exploitation of optical and near-infrared imaging surveys. Historical surveys 
have been digitized (POSS and its southern counterpart) or are in the process 
of being digitized. In recent years surveys have been performed which cover
hundreds or thousands of square degrees up to the whole sky (SDSS, 2MASS, 
CFHTLS, etc.). Many more are in progress or coming up with increasing 
spatial resolution, depth, and survey areas (OmegaCAM on VST, VIRCAM 
on VISTA, Pan-STARRS, LSST, etc.). The data rate of existing surveys is 
rapidly approaching terabytes per night, leading to survey volumes well into 
the petabyte regime and the new surveys will add many tens of petabytes to 
this. Hundreds of terabytes of data will start entering the system when ESO's
OmegaCAM camera starts operations in Chile in late-2011. Several large sur- 
veys plan to use the Astro-WISE information system to manage their data: 
the 1500 deg^2 KIDS Survey, the Vesuvio Survey of nearby superclusters,
the OmegaWhite white dwarf binary survey and the OmegaTranS search for
transiting variables.

Quality control is typically one of the largest challenges in the chain from 
raw data of the "sensor networks" to scientific papers. It requires an environ- 
ment in which all non-manual qualification is automated and the scientist can 
graphically inspect where needed by easily going back and forth through the 
data (the pixels) and metadata (everything else) of the whole processing chain 
for large numbers of data products. The full quality control mechanisms are
treated in complete detail in the Astro-WISE Quality Control paper (McFarland
et al. 2011).



The really novel aspect of this new paradigm is the long-term preservation 
of the raw data and the ability of re-calibrating it to the requirements of new 
science cases. The data of the majority of these surveys is fully public: any 
astronomer is entitled to a copy of the data. Therefore the same survey data is
used for not only science cases within the original plan, but many new science 
cases the original designers of the survey were not planning to do themselves 
or did not foresee. To be able to do this successfully requires that everyone is 
provided access to detailed information on the existing calibration procedures 
and resulting quality of the data at every stage of the processing, that is, have 
access to the data and the metadata, including process configuration at every 
step in the chain from raw data to final data products. 

In this paper we describe the reduction of data in the Astro-WISE infor- 
mation system, generally referred to as the Astro-WISE Environment (hereafter 
AWE). The processing of data from both the WFI and OmegaCAM instruments
has been used to qualify the pipeline, the results of which have been or will be
included in separate publications, for example Verdoes et al. (2007); Valentijn
et al. (2011). The remainder of this section briefly describes some key concepts
of AWE covered in detail elsewhere: previously in Valentijn et al. (2007) and
more recently in Begeman et al. (2011). Sections 2 and 3 describe how an
instrument is calibrated and how science data is processed. Finally, Sect. 4
presents the summary.

See, e.g., http://archive.stsci.edu/dss/
http://tdc-www.harvard.edu/plates/
http://www.lsw.uni-heidelberg.de/projects/scanproject/

See, e.g., http://www.lsst.org/lsst/science/technology
http://pan-starrs.ifa.hawaii.edu/public/design-features/data-handling.html



1.1 Context 

Context is the primary tool of project managers in AWE. Each process target
(i.e., the result of some processing step, see Sect. 1.2.2) in AWE is created at a
specific privilege level. Privilege levels are analogous to the permission levels of 
a Unix/Linux file system (e.g., privilege levels 1, 2, 3 map loosely to permission 
levels user, group, other). To allow access to their desired set of objects, users 
can set their privilege level and their project. 

This concept of context is completely about visibility of the objects in 
AWE and nothing else. Proprietary data is protected from access by all but 
authorized users and undesirable data can be hidden for any purpose (e.g., 
to use project-specific calibrations instead of general ones). All processing is 
done within this framework, allowing complete control over what is processed 
and how, and how it is published between project groups and to the world. 

Visibility for processing targets is not only governed by the privilege level, 
but also by validity. Three properties dictate validity: 

1. is_valid - manual validity flag

2. quality_flags - automatic validity flag

3. timestamps - validity ranges in time (for calibrations only)

What needs to be processed, and how, is controlled by setting any or
all of the above properties. For instance, obviously poor quality data can be flagged
by setting its is_valid flag to 0, preventing it from ever being processed auto- 
matically. The calibrations used are determined by their timestamps (Which 
calibrations are valid for the given data?) and the quality of processed data by 
the automatic setting of its quality_flags (Is the given data good enough?).
Good quality data can then be flagged for promotion (is_valid > 1) and 
eventually promoted in privilege by its creator (published from level 1 to 2) 
so it can be seen by the project manager who will decide if it is worthy to be 
promoted once again (published from level 2 to 3 or higher) to be seen by the 
greater community. 
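
The visibility rules of this section can be summarised in a small sketch. It is not AWE code: the Target record, the is_visible helper, the rule that a user working at privilege level n sees targets at level n and above, and the convention that quality_flags equal to 0 means "no problems found" are all assumptions made for illustration. Only the attribute names is_valid and quality_flags come from the text, and project membership is ignored for brevity.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Hypothetical stand-in for a process target (not the AWE class)."""
    name: str
    creator: str
    privilege: int      # 1 = creator only, 2 = project, 3 = wider community
    is_valid: int       # manual flag: 0 = rejected, 1 = valid, >1 = flagged for promotion
    quality_flags: int  # automatic flag; 0 is assumed to mean "no problems found"


def is_visible(target, user, user_privilege):
    """Return True if `user`, working at `user_privilege`, may use `target`."""
    if target.is_valid == 0 or target.quality_flags != 0:
        return False                      # invalid data is never picked up
    if target.privilege < user_privilege:
        return False                      # below the level the user asked for
    if target.privilege == 1:
        return target.creator == user     # level 1 is private to its creator
    return True


targets = [
    Target("bias_a", "alice", privilege=1, is_valid=1, quality_flags=0),
    Target("bias_b", "bob", privilege=2, is_valid=1, quality_flags=0),
    Target("bias_c", "bob", privilege=2, is_valid=0, quality_flags=0),
    Target("bias_d", "carol", privilege=3, is_valid=2, quality_flags=4),
]
visible_to_alice = [t.name for t in targets if is_visible(t, "alice", 1)]
```

In this toy setup, raising the working privilege level from 1 to 2 hides the private bias_a from its own creator, which mirrors the use of context to select project-wide rather than personal calibrations.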



1.2 Provenance: full dependency linking 

AWE uses its federated database to link all data products to their progenitors 
(dependencies), creating a full data lineage of the entire processing chain. This 
allows creation of complete data provenance for any data item in the system 
at any time. 
Fig. 1 A target diagram: slightly simplified object model that is a view of the dependencies
of "targets" to raw observational data. The arrows indicate the backward chaining to the 
raw data, not the progression through any processing pipeline. The colors provide a visual 
grouping of similar types of data products. 



1.2.1 Full data lineage 



Raw data is linked to the final data product via database links within the 
data object, allowing all information about any piece of data to be accessed 
instantly. See Mwebaze et al. (2009) for a detailed description of AWE's data
lineage implementation. This data linking uses the power of Object-Oriented 
Programming to create this framework in a natural and transparent way. 



1.2.2 Object-oriented data model 



AWE uses the advantages of Object-Oriented Programming (OOP) to process 
data in the simplest and most powerful ways. In essence, it turns the aforementioned
data objects into OOP objects, called process targets (or ProcessTargets),
that are instances of classes with attributes and methods that can
be inherited (see Figs. 1 and 2 for an overview of an Astro-WISE object model).
Each of these ProcessTarget instances knows of all of its local and linked 
metadata, and knows how to process itself. Each persistent attribute of an 
object is linked to metadata or to another object that itself contains links to 
its own metadata. 

The code for AWE is written in Python, a programming language highly
suitable for OOP. Consequently, Python classes are associated with the various
conventional calibration images, data images, and other derived data products.
For example, in AWE, bias exposures become instances of the RawBiasFrame
class, and twilight (sky) flats become instances of the RawTwilightFlatFrame
class. These instances of classes are the "objects" of OOP.

Fig. 2 An Astro-WISE hierarchical object model. A simplified object model of the target
classes shown in Fig. 1 illustrating their inheritance relationship to each other. The classes
without color do not appear in the previous figure, but are nonetheless part of the hierarchy
and are shown for clarity. Every target inherits from DBObject (a database object), but
only those with associated bulk data (typically a file stored on a dataserver) inherit from
DataObject.

For the remainder of this document, the class names of objects, their properties,
and methods will be in teletype font for clearer identification.

1.3 Target-based processing 

A distinguishing aspect of AWE is its ability to process data based on the
final desired result to an arbitrary depth. In other words, the data is pulled 
from the system by the user. The desired result is the target to be processed, 
and the framework used is called target processing. Target processing uses 
methods similar to those found in the Unix/Linux make utility. When a target 
is requested, its dependencies are checked to see if they are up-to-date. If there 
is a newer dependency or if the requested target does not exist, the target is 
(re)made. This process is recursive and is an example of backward chaining. 

1.3.1 Backward chaining 

At the base of AWE target processing is the concept of backward chaining. 
Contrary to the typical case of forward chaining (e.g., objectN is processed
into objectN+1, which is processed into objectN+2, etc.), AWE database links allow
the dependency chain to be examined from the intended target (even if it does 
not yet exist) all the way back to the raw data. The above scenario would 
then look like: if targetM is up-to-date, check if targetM-1 is up-to-date; if 
targetM-1 is up-to-date, check if targetM-2 is up-to-date; etc., processing as 
necessary until targetM (and all targets it depends on) exists and is up-to-date.
This is the AWE implementation of backward chaining that is used in
target processing (see Fig. 1 for an example with astronomical data).
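
The make-like recursion described above can be illustrated with a minimal sketch. The Node class, the logical clock and the make function are hypothetical and do not reproduce the AWE implementation, which also weighs processing parameters and quality flags. The sketch only shows the backward walk from the requested target to the raw data and the rebuilding of whatever is missing or older than one of its dependencies.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical target in a dependency graph (not an AWE class)."""
    name: str
    deps: list = field(default_factory=list)
    built_at: Optional[int] = None   # logical build time; None = does not exist


def make(node, clock, log):
    """Bring `node` up to date, checking its dependencies first.

    `clock` is a one-element list used as a mutable logical clock and
    `log` collects the names of the targets that were (re)built.
    """
    newest_dep = 0
    for dep in node.deps:
        make(dep, clock, log)                       # backward chaining step
        newest_dep = max(newest_dep, dep.built_at)
    if node.built_at is None or node.built_at < newest_dep:
        clock[0] += 1
        node.built_at = clock[0]                    # stands in for real processing
        log.append(node.name)
    return node


raw = Node("raw_bias", built_at=1)
bias = Node("master_bias", [raw], built_at=2)
flat = Node("master_flat", [bias], built_at=3)
science = Node("reduced_science", [bias, flat])     # requested, not yet made

clock = [3]
first_run = []
make(science, clock, first_run)

# A newer master bias appears; only its dependents are rebuilt on request.
clock[0] += 1
bias.built_at = clock[0]
second_run = []
make(science, clock, second_run)
```

The first request builds only the missing science target, while the second rebuilds the master flat and the science target because both have become older than the new master bias.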

1.3.2 Processing parameters 

As mentioned earlier, conventional astronomical calibration images/products 
as well as science products are collectively referred to as process targets and 
inherit from the ProcessTarget class. Each ProcessTarget has an associated 
processing parameters object, an instance of a class named after the respective 
process target class (e.g., SomeTarget.SomeTargetParameters) which stores
configurable parameters that guide the processing or reprocessing of that target.
Those ProcessTargets that use external programs in their derivation may
have additional objects associated with them which contain the configuration 
of the external program that was used. 



Note that the counting of targets is reversed in the backward chaining example, as this
is the direction in which the up-to-date check is run. 
Fig. 3 A screen-capture of part of the web-based target processing interface. On the left
are high-level processing settings (e.g., project, processing step, options). On the right is 
the result of the query for a particular target. Green rows show dependencies that are ready 
and will not be processed, red and orange rows show dependencies that are either outdated 
(need to be rebuilt) or already have a new version available. This section is a glimpse at 
the information used to dynamically construct the workflow that will create the eventual 
processing pipeline. Only those targets in the red rows will actually be processed. 



These processing parameters are stored in an object linked to the ProcessTarget
for comparison by the system and to allow all persons involved in
survey operations to discover which settings resulted in the best data reduction.
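
As an illustration of how a parameters object linked to its target allows settings to be compared, consider the following sketch. The class layout, the differences method and the way defaults are stored are assumptions. Only the naming pattern SomeTarget.SomeTargetParameters and the two parameter names (taken from the BiasFrame rows of Table 2) come from the text.

```python
class SomeTargetParameters:
    """Hypothetical parameters object; layout and methods are illustrative."""
    DEFAULTS = {"sigma_clip": 3.0, "overscan_correction": 6}

    def __init__(self, **overrides):
        unknown = set(overrides) - set(self.DEFAULTS)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        self.values = {**self.DEFAULTS, **overrides}

    def differences(self, other):
        """Return {name: (own value, other value)} for settings that differ."""
        return {name: (value, other.values[name])
                for name, value in self.values.items()
                if value != other.values[name]}


class SomeTarget:
    """Hypothetical process target carrying its own parameters object."""

    def __init__(self, process_params=None):
        self.process_params = process_params or SomeTargetParameters()


stored = SomeTarget()                                     # made with defaults
requested = SomeTarget(SomeTargetParameters(sigma_clip=5.0))
changed = requested.process_params.differences(stored.process_params)
```

A non-empty result is the kind of information a system could use to decide that an existing target does not match the requested configuration and has to be remade.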



1.4 On-demand reprocessing 

AWE combines all of the above concepts into a coherent archiving and processing 
system. All the information about a particular instrument and its calibration 
and processing history is stored in the federated database within the object- 
oriented data model with full linking of the data lineage. The values of the 
process parameters of all objects in the dependency chain and all the results of 
the integrated (and manual) quality controls of the target of interest (regardless 
of visibility or existence) are used to determine if that target can or should be 
(re)built and how. This data pulling is the heart of AWE and is called target 
processing (see Fig. 3 and http://process.astro-wise.org/).



1.4.1 Raw data sacred



As mentioned earlier, AWE does not provide as the ultimate end of the processing
chain a static data release. The system allows for survey data to be
reprocessed for any reason and for any purpose. If a newer, better calibration
is made, or if a different purpose requires a different processing technique, the
data can be easily reprocessed. This is only possible when the raw survey data
is retained in its original form. In AWE raw data is always preserved.

Fig. 4 Schematic flow of the pixel calibrations pipeline following the coloring in Fig. 1. The
recipes, also called Tasks, used to produce various ProcessTargets are indicated in each
box (with their data product in parentheses) and described in the various sections. The
arrows connecting them indicate the direction of processing. Note that the sections with the
hatched boxes are optional branches in this pipeline, and the arrow at the end leads to the
beginning of the photometric pipeline schematic in Fig. 5. Also note, in order to simplify
this diagram, the GainLinearity, DarkCurrent, and NightSkyFlatFrame objects have been
omitted.

1.4.2 On-the-fly (re)processing

Target processing does not use static information to determine what gets processed
and how. As seen in the previous sections, the survey data, its dependency
linkages and processing parameters are all reviewed to allow any target
to be (re)processed on-demand as needed. All these dependencies create a 
built-in workflow, automatically processing only those targets that need it. 
This on-the-fly (re)processing is the hallmark of the AWE information system. 



2 Calibration Pipeline: correcting the pixels 

The philosophy of AWE is to share improved insight in calibrations. In AWE, cal- 
ibration scientists can, over time, have many versions of calibration results at 
their disposal. From this they determine (subtle) long term trends in instru- 
ment, telescope and atmospheric behaviour and can collaborate to improve 
the calibration procedures for that instrument in AWE accordingly. The com- 
plete observational system (generally termed "the instrument" for simplicity) 
eventually becomes calibrated over its full operational period as opposed to 
a series of individual nights calibrated from data in a limited time window. 
Fig. 4 shows the schematic view of the pixel calibrations pipeline. This gives
an overview of the flow of the pixel calibrations to be described in the coming
sections. It is continued in the photometric pipeline schematic in Fig. 5.

In AWE, calibration objects have a set validity range in time or per
frame object that depends upon the calibration object (the defaults are specified
per calibration object in Table 1 below). The default validity time range
(timestamp_start to timestamp_end) can be altered on the command-line using
context methods (see Sect. 1.1), or via the CalTS web-service (see Fig. 6).
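
How a validity range in time can drive the choice of calibration is sketched below. The Calibration record, the creation_date attribute, the half-open interval and the select_calibration helper are assumptions for illustration. Only timestamp_start, timestamp_end and is_valid are names used in this paper, together with the rule that the newest valid calibration is taken to be the best.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Calibration:
    """Hypothetical calibration record (not the AWE class)."""
    name: str
    timestamp_start: datetime
    timestamp_end: datetime
    creation_date: datetime      # assumed attribute used to rank "newest"
    is_valid: int = 1


def select_calibration(calibrations, observed_at):
    """Return the newest valid calibration covering `observed_at`, or None."""
    candidates = [c for c in calibrations
                  if c.is_valid > 0
                  and c.timestamp_start <= observed_at < c.timestamp_end]
    return max(candidates, key=lambda c: c.creation_date, default=None)


cals = [
    Calibration("bias_old", datetime(2011, 3, 1), datetime(2011, 3, 8),
                creation_date=datetime(2011, 3, 2)),
    Calibration("bias_new", datetime(2011, 3, 4), datetime(2011, 3, 6),
                creation_date=datetime(2011, 3, 5)),
    Calibration("bias_bad", datetime(2011, 3, 4), datetime(2011, 3, 6),
                creation_date=datetime(2011, 3, 7), is_valid=0),
]
chosen = select_calibration(cals, datetime(2011, 3, 5, 2, 30))
```

Here the older calibration is "eclipsed" by the newer one for the night of 5 March but is still selected for observations outside the newer validity range.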
Table 1 Default validities of calibration ProcessTargets. All time spans are centered on
local midnight of the day the source observations were taken unless otherwise indicated.





Class                  process_param        value  units

ReadNoise              rejection_threshold  5.0
                       maximum_iterations   5
GainLinearity          overscan_correction  6
                       rejection_threshold  5.0
                       maximum_iterations   5
BiasFrame              overscan_correction  6
                       sigma_clip           3.0
HotPixelMap            rejection_threshold  5.0
                       maximum_iterations   5
ColdPixelMap           threshold_low        0.94
                       threshold_high       1.06
DomeFlatFrame          overscan_correction  6
                       sigma_clip           3.0
TwilightFlatFrame      overscan_correction  6
                       sigma_clip           3.0
MasterFlatFrame        dig_filter_size      9.0
                       mirror_xpix          75     pixel
                       mirror_ypix          150    pixel
                       median_filter_size   36     pixel
                       combine_type         1
PhotometricParameters  sigclip_level        1.5
                       min_nmbr_of_stars    3


Table 2 Processing parameters and their generic default values. These values are representative
of the typical value for any instrument. Some instruments may have values that
differ from these based on experience with that instrument. See the documentation page linked
from the class name or appropriate links on http://doc.astro-wise.org/astro.main.html for
more details.
Fig. 5 Schematic flow of the photometric pipeline following the coloring in Fig. 1. The
recipes, also called Tasks, used to produce various ProcessTargets are indicated in each
box (with their data product in parentheses) and described in the various sections. The
arrows connecting them indicate the direction of processing. Note that the sections with the
hatched boxes are optional branches in this pipeline, and the input follows from the pixel
calibrations pipeline shown in Fig. 4.
Fig. 6 A screen-capture of CalTS, the web-based Calibration TimeStamp service. The
purpose of this service is to give a graphical representation of the temporal validity ranges
of calibration objects in AWE. On the left can be selected the ProcessTarget of interest,
at the top are some of the query criteria, and below this, the graphical validity of the
ProcessTarget. Colored bars indicate the most recent valid objects (objects flagged invalid
are hidden), while black bars indicate where objects are "eclipsed" by newer calibrations. It
is always assumed that the newest valid ProcessTarget is the best and this will be the one
used during processing. The timestamps and validity can be modified by an interface raised
by clicking on the date range for a given object (http://calts.astro-wise.org/).



Note that, with the exception of parts of the astrometric calibration
derivation and most of the photometric calibration derivation, all calibration
objects are normally processed in a parallel environment, one detector
chip per CPU node.

Many ProcessTargets have configurable processing parameters to control
how they are processed. Table 2 gives an overview of these process_params for
the calibration pipeline. In addition to the process_params associated directly
with the ProcessTarget, there exist object representations of configuration
files for external programs wrapped in Python (e.g., SExtractor, SWarp, etc.).

2.1 ReadNoise

The read-out noise is the noise introduced in the data by the read-out process 
of detector chips. It is measured from pairs of bias exposures. The RMS scatter 
of the differences between two bias exposures is computed. The read noise in 
ADU is determined via division of this value by √2. The read noise value is
stored in the database using the ReadNoise class. 
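
The factor √2 follows because the difference of two frames with independent noise σ has a scatter of √2 σ. A minimal sketch of the measurement is given below. The function name and the synthetic frames are illustrative only, and the iterative outlier rejection that the rejection_threshold and maximum_iterations parameters of Table 2 suggest is deliberately left out.

```python
import math
import random
import statistics


def read_noise_adu(bias_a, bias_b):
    """Estimate the read noise in ADU from two equally sized bias frames.

    Each frame is a list of rows of pixel values. The scatter of the
    pixel-by-pixel difference is sqrt(2) times the single-frame noise.
    """
    diff = [a - b
            for row_a, row_b in zip(bias_a, bias_b)
            for a, b in zip(row_a, row_b)]
    return statistics.pstdev(diff) / math.sqrt(2)


# Synthetic check: two 100x100 bias frames with 4 ADU of Gaussian noise.
rng = random.Random(1)


def fake_bias(level=200.0, sigma=4.0, size=100):
    """Fabricate a bias frame; values are test data, not instrument data."""
    return [[rng.gauss(level, sigma) for _ in range(size)] for _ in range(size)]


estimate = read_noise_adu(fake_bias(), fake_bias())
```

Taking the difference of two frames removes any fixed bias pattern, so the estimate reflects the read-out noise alone.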

2.2 GainLinearity

The gain is the conversion factor between the signal in ADU supplied by the
readout electronics and the detected number of photons (in units of e-/ADU).
For OmegaCAM, a procedure (template) to determine the gain (and the linearity
of the detector chips) is defined that involves taking two series of 10
dome flat-field exposures with a wide range of exposure times, and deriving
the RMS of the differences of two exposures taken with similar exposure
(integration) time. The regression of the square of these values with the median
level yields the conversion factor in e-/ADU (assuming noise dominated by
photon shot noise). A linear fit of exptimes vs. median_sum gives a measure of
the linearity. For most instruments default gain values have been determined
or taken from the literature and are in the system, so it is usually not necessary
to make new values for them. If this is desired, a specialized dataset similar to
that described must be used. The class used to store the gain in the database
is GainLinearity.
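
The regression can be made explicit under the shot-noise assumption. For a signal S in ADU and a gain g in e-/ADU, the single-frame variance is S/g in ADU^2, and the difference of two frames has twice that variance. The slope of variance against level is therefore 1/g. The sketch below is one possible way to carry out such a fit. The function name, the hand-written least-squares slope and the fabricated data are assumptions, not the AWE recipe.

```python
import math


def gain_from_pairs(levels_adu, rms_of_differences_adu):
    """Estimate the gain in e-/ADU from pairs of flat-field exposures.

    levels_adu             -- median signal level of each pair (ADU)
    rms_of_differences_adu -- RMS of the difference image of each pair (ADU)
    """
    # The difference of two frames has twice the single-frame variance.
    variances = [rms ** 2 / 2.0 for rms in rms_of_differences_adu]
    n = len(levels_adu)
    mean_x = sum(levels_adu) / n
    mean_y = sum(variances) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(levels_adu, variances))
    sxx = sum((x - mean_x) ** 2 for x in levels_adu)
    slope = sxy / sxx            # ADU^2 per ADU, equal to 1 / gain
    return 1.0 / slope


# Fabricated data: gain of 2.5 e-/ADU plus a constant read-noise variance.
true_gain = 2.5
levels = [1000.0, 5000.0, 10000.0, 20000.0, 40000.0]
rms_diff = [math.sqrt(2.0 * (level / true_gain + 4.0)) for level in levels]
gain = gain_from_pairs(levels, rms_diff)
```

A constant read-noise term only shifts the intercept of the fit, which is why a slope-based estimate is preferred over a single variance-to-level ratio.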

2.3 BiasFrame

The signal in raw scientific frames contains a component that is due to a
bias current introduced by the AD converter on a FIERA or other detector
controller. This component shows up as an offset to the signal. In most CCD
detectors, the bias offset has the following characteristics: i) the bias level
grows to its asymptotic level in the first few hundred lines, and ii) the bias
level depends on the total signal in a given line. Therefore, an initial bias
correction, the overscan correction, is applied when the overscan region exists
(cheaper CCDs and IR detectors tend not to have these regions). The method
used is one of a set of methods ranging from no correction, to subtraction
of a constant value derived from one of the prescan or overscan regions, to
subtracting an average value per column or row, smoothed or not, to hybrid
corrections for complex geometries. Each of these methods is given an index
which is stored in the database, constituting the only really "free" parameter
in the system.

FIERA: acronym for Fast Imager Electronic Readout Assembly CCD controller,
http://www.eso.org/projects/odt/Fiera/

In addition, the bias offset exhibits a residual pattern, which is measured
by the master bias frame, an instance of the BiasFrame class. To construct
the master bias, a series of N (usually 5-10) zero-second bias exposures is
overscan-corrected and averaged, rejecting 5σ outliers (σ = readout noise from
a ReadNoise object) due to particle hits during read-out. The resulting master
bias frames will be used for the correction of all frames.

As the read-out noise dominates the RMS scatter in the bias frames, while 
the shot noise of the sky background dominates the RMS scatter on the sky 
images, which is nominally much larger than the readout noise, it is sufficient 
to characterize the bias value at individual pixels with an accuracy of (readout
noise/√N).
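
The averaging with outlier rejection used for the master bias can be sketched as follows. This is an illustration under stated assumptions rather than the AWE recipe: frames are plain lists of rows, the rejection is centred on the per-pixel median of the stack, and the function name and example values are invented.

```python
import statistics


def master_bias(frames, read_noise, n_sigma=5.0):
    """Average overscan-corrected bias frames pixel by pixel, ignoring outliers.

    frames     -- list of N frames, each a list of rows of pixel values (ADU)
    read_noise -- sigma used for the rejection, in ADU
    """
    n_rows, n_cols = len(frames[0]), len(frames[0][0])
    result = []
    for y in range(n_rows):
        row = []
        for x in range(n_cols):
            values = [frame[y][x] for frame in frames]
            centre = statistics.median(values)
            kept = [v for v in values
                    if abs(v - centre) <= n_sigma * read_noise]
            if not kept:                 # pathological stack: fall back
                kept = [centre]
            row.append(sum(kept) / len(kept))
        result.append(row)
    return result


frames = [
    [[100, 101], [99, 100]],
    [[102, 99], [100, 101]],
    [[500, 100], [101, 99]],    # particle hit in the first pixel
    [[98, 100], [100, 100]],
    [[100, 100], [100, 100]],
]
bias = master_bias(frames, read_noise=2.0)
```

With N frames surviving the rejection, the noise of each master bias pixel is reduced to roughly the readout noise divided by √N, which is the accuracy quoted above.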



2.4 DarkCurrent

In AWE, no formal dark frame subtraction is performed. Current, liquid nitro- 
gen cooled instruments tend to have little or no appreciable two-dimensional 
dark current structure, any of which will normally be removed with the sky 
background. As AWE was created explicitly for such an instrument, dark frame 
correction was not included. There is, however, some treatment of this effect 
through the DarkCurrent class. The purpose of this class is to determine the
total dark current and the particle event rate of a detector chip. This is not
used for calibration, but for monitoring the health of the detector chain.

The dark current, excess signal due to heat in a detector chip, is measured 
by taking 3 identically timed exposures (typically one hour) with the camera 
shutter closed. The resulting frames are trimmed, overscan- and bias-corrected, 
then a median is taken along the Z-axis of the exposure stack. After iterative 
outlier rejection, the average value of all the pixels is the dark current in units 
of ADU/pixel/hour. 
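
A minimal sketch of this measurement is shown below. The function name, the 5σ rejection threshold and the limit of 5 iterations are assumptions (chosen to resemble the defaults of other classes in Table 2), and the frames are assumed to be already trimmed and overscan- and bias-corrected.

```python
import statistics


def dark_current(frames, exposure_hours, n_sigma=5.0, max_iterations=5):
    """Return the dark current in ADU/pixel/hour from corrected dark frames."""
    # Median along the stack (Z) axis removes particle hits in single frames.
    pixels = [statistics.median(values)
              for rows in zip(*frames)        # the same row of every frame
              for values in zip(*rows)]       # the same pixel of every frame
    for _ in range(max_iterations):
        mean = statistics.fmean(pixels)
        sigma = statistics.pstdev(pixels)
        kept = [p for p in pixels if abs(p - mean) <= n_sigma * sigma]
        if not kept or len(kept) == len(pixels):
            break                             # nothing (sensible) left to reject
        pixels = kept
    return statistics.fmean(pixels) / exposure_hours


def fake_dark(size=6, level=6.0):
    """Fabricate a dark frame with one hot pixel; values are test data only."""
    frame = [[level] * size for _ in range(size)]
    frame[0][0] = 300.0                       # hot pixel present in every frame
    return frame


darks = [fake_dark(), fake_dark(), fake_dark()]
darks[1][3][3] = 900.0                        # particle hit in one frame only
rate = dark_current(darks, exposure_hours=1.0)
```

The median along the stack removes the single-frame particle hit, while the iterative rejection removes the persistent hot pixel before the average is taken.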

The same trimmed, overscan- and bias-corrected frames are used to determine
the particle event rate. The source extraction software SExtractor
is used on each image in turn to detect the number of cosmic ray particle
events. A HotPixelMap can optionally be used to mask detected hot pixels.
The particle event rate is determined in units of particles/cm^2/hour.



2.5 HotPixelMap and ColdPixelMap



Hot pixels are pixels which have high count rates despite not being illuminated.
In AWE, these pixels are detected from bias images (which have an exposure
time of 0 seconds). More precisely: greater than 5σ outliers in the bias are defined
as hot pixels. Cold pixels are broken pixels which have low or zero counts even
when illuminated. These pixels are determined from dome flat-field exposures
because those have the most uniform and consistently high counts required.
Twilight flat-fields can be used if no dome flat-fields are available. In AWE, all
pixels that deviate substantially, i.e., by more than 4%, from
the surrounding pixels in the flat-field are considered cold even though brighter pixels
are also detected. All deviant pixels are flagged in weight maps (mask images),
where good pixels have a value of 1 and bad pixels a value of 0.

SExtractor: http://astromatic.iap.fr/software/sextractor/

The procedure to create a HotPixelMap starts with calculating a back- 
ground map of the master bias frame and subtracting it. This is done to avoid 
detecting induced charge structures and other continuous structures as hot 
pixels. Outliers in the background-subtracted master bias frame are bad/hot 
pixels. A HotPixelMap is created using the threshold determined from itera- 
tive statistics estimates. The number of hot pixels is noted as a quality control 
value. 

The procedure to create a ColdPixelMap starts with smoothing the flat- 
field image. The smoothed flat is used to normalize, or "flatten" the flat to 
eliminate large deviations from flatness that could erroneously cause entire 
regions to be marked as "cold". In this flat-field image, pixels that are outside
a given range (±4%) are taken to be cold pixels. Note that this invalidates
any pixel whose gain differs significantly from its immediate neighbors. In
particular, this also identifies pixels that are bright relative to their neighbors
as "cold": pixels above the threshold are formally not cold, but are
flagged anyway. In the end, HotPixelMaps and ColdPixelMaps are combined
into weights of the detrended science images. A ColdPixelMap is created using 
the thresholds given above. The number of cold pixels is noted as a quality 
control value. 

We use SExtractor to produce the smoothed images. SExtractor uses a 
robust algorithm to estimate the background on a grid and interpolate between 
these grid points. By measuring this background for the bias and flat-field 
we essentially have a fast smoothing algorithm with a large kernel, that is 
relatively insensitive to bad pixels. 
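
The cold pixel procedure can be sketched as follows. A small median box filter is used here as a stand-in for the SExtractor background estimate, so this is a simplified substitute and not the AWE implementation. The function names are invented, and the default thresholds are the ColdPixelMap values of Table 2 (the ±4% range quoted in the text would correspond to 0.96 and 1.04).

```python
import statistics


def median_smooth(image, half_width=1):
    """Median-filter `image` (a list of rows) with a square box clipped at edges."""
    n_rows, n_cols = len(image), len(image[0])
    smoothed = []
    for y in range(n_rows):
        row = []
        for x in range(n_cols):
            box = [image[j][i]
                   for j in range(max(0, y - half_width),
                                  min(n_rows, y + half_width + 1))
                   for i in range(max(0, x - half_width),
                                  min(n_cols, x + half_width + 1))]
            row.append(statistics.median(box))
        smoothed.append(row)
    return smoothed


def cold_pixel_map(flat, threshold_low=0.94, threshold_high=1.06):
    """Return a weight map: 1 for good pixels, 0 for deviant ("cold") ones."""
    smoothed = median_smooth(flat)
    return [[1 if threshold_low <= value / local <= threshold_high else 0
             for value, local in zip(row, smooth_row)]
            for row, smooth_row in zip(flat, smoothed)]


flat = [
    [1.00, 1.01, 1.02, 1.03],
    [1.00, 0.50, 1.02, 1.03],   # genuinely cold pixel
    [1.00, 1.01, 1.02, 1.30],   # bright pixel, flagged as "cold" as well
    [1.00, 1.01, 1.02, 1.03],
]
weights = cold_pixel_map(flat)
```

Dividing by the smoothed image removes the gentle gradient across the example flat, so that only the two pixels deviating from their neighbours receive a weight of 0.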



2.6 Flat fielding 

A flat-field is the response of the telescope-camera system to a source of uniform
radiation. In AWE, there are different ways to construct a flat field. Dome
flat-fields are created by pointing the telescope at a screen on the inside of the 
dome which is illuminated by lamps. Dome flat fields have the advantage (over 
twilight flat fields) that it is easy to repeatedly obtain a high signal to noise 
level. Disadvantages are that the direction in which light enters the telescope 
may be different than during night time observations, that the color of the 
dome lamp differs from the color of the night sky and that it is very difficult 
to illuminate a screen in such a way that it is a source of uniform radiation. 
A dome flat field is useful for tracing small scale structure variations. A disadvantage
of twilight flats is that they can already contain objects like stars
during exposures, which should be corrected for by dithering the twilight flats.
Twilight flat fields thus are better at tracing large scale structure variations.
These considerations result in the desire to combine dome flats and twilight
flats by spatially filtering the two types of flat fields.



2.6.1 DomeFlatFrame



A DomeFlatFrame is obtained through an average with sigma rejection procedure
on a stack of raw dome flats, intended to reduce photon shot noise and
remove cosmic rays.

The procedure to make a DomeFlatFrame starts with 5-10 overscan-corrected,
trimmed and debiased raw dome flats. These are normalized to the
median, taking into account hot and cold pixels, and averaged rejecting 5σ
outliers: the median in Z-axis of the stack is used to determine the σ levels.
The computed mean in the Z-axis of the stack is the final DomeFlatFrame
image. Lastly, sub-window image statistics are determined for quality control
purposes.



2.6.2 TwilightFlatFrame



A TwilightFlatFrame is obtained through an average with sigma rejection
procedure on a stack of raw twilight flats, intended to remove any contamination
(including stars) present on individual raw twilight flats and reduce
photon shot noise.

The procedure to make a TwilightFlatFrame starts with 5-10 overscan-corrected,
trimmed and debiased raw twilight flats. These are normalized to the
median, taking into account hot and cold pixels, and averaged rejecting 5σ
outliers: the median in Z-axis of the stack is used to determine the σ levels.
The computed mean in the Z-axis of the stack is the final TwilightFlatFrame
image. Lastly, sub-window image statistics are determined for quality control
purposes.



2.6.3 NightSkyFlatFrame

Raw science images have a non-flat background, attributed to flat-field effects.
Information about how to flat-field science images therefore is present
in the science images themselves. The flat-field that most closely reproduces
the actual gain variations of these images can be obtained by averaging a
large number of flat-fielded science and standard observations, taking care of
properly masking the contaminating objects. Such a night-sky flat could, in
principle, improve on the quality of the twilight flat and may also be suitable
for fringe removal.

The procedure to create a NightSkyFlatFrame starts with a minimum of
5 non-cospatial science images within a given night in a given band to achieve
optimal results. Images are overscan-corrected, trimmed, debiased, flat-fielded
and normalized, then stacked and a median along the Z-axis is calculated. This
median is intended to remove any exposure-specific effects (objects, cosmic
rays, satellite tracks, etc.). The median image is then normalized to the mean
taking into account hot and cold pixels.

2.6.4 MasterFlatFrame



In AWE, a MasterFlatFrame is constructed from a DomeFlatFrame (used to 
measure the small-scale pixel-to-pixel variation) and a TwilightFlatFrame (used 
to measure the large-scale variation). These spatial frequencies are separated 
using a Fourier technique. NightSkyFlatFrames are created from raw science 
or standard data that have been flat-fielded with this master flat-field and can 
be used to improve its quality. This (improved) master flat-field is then 
used to flat-field the science and standard images in the image pipeline. 

In practice, not all three flat-field types are available. As a result, AWE offers 
three different combination methods: 

1. the MasterFlatFrame is constructed by extracting high spatial frequency 
components from the DomeFlatFrame and low spatial frequencies from the 
TwilightFlatFrame, multiplied to give the master flat 

2. the MasterFlatFrame is a direct copy of the DomeFlatFrame 

3. the MasterFlatFrame is a direct copy of the TwilightFlatFrame 



In all cases a NightSkyFlatFrame can be provided which is multiplied with 
this master flat-field as an improvement on it as mentioned above. 

In certain situations, it may be advantageous to split the DomeFlatFrame 
and TwilightFlatFrame contributions out of the process. The machinery of 
AWE allows this to be accomplished in a straightforward manner. The 
advantages to this would be in isolating either large-scale (low spatial 
frequencies) or small-scale (high spatial frequencies), pixel-to-pixel variations 
of the TwilightFlatFrame or DomeFlatFrame, respectively. This concept will be 
explored further in Sect. 2.8.4. 



To give a more detailed description, low spatial frequencies are extracted 
from the master dome and master twilight flats by the process indicated below. 
The high spatial frequencies of the dome flat are obtained by dividing the dome 
flat by its low spatial frequency components. The low spatial frequencies of the 
twilight flat are then multiplied by the high spatial frequencies of the dome 
flat. 

Low spatial frequencies are extracted as follows: 

- all bad pixels in input images are replaced by the median value of the pixels 
  in a box around the bad pixel 

- to reduce problems with Fourier filtering near image edges the size of the 
  image is increased by mirroring the edges and corners 

- a two-dimensional array is created containing the equivalent of a circular 
  Gaussian convolution function in Fourier space (taking into account the 
  quadrant shift introduced by the Fourier transform) 

- the Fourier transform of the image is multiplied by the Gaussian filter 

- the image is transformed back, and the mirrored regions removed 

- the resulting image is normalized, excluding bad pixel values 
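
The low-frequency extraction listed above, and the first combination method for the master flat, can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the AWE code: the function names and the parameters `sigma_pix` (Gaussian width in pixels), `pad` (width of the mirrored border) and `box` (half-size of the bad-pixel box) are hypothetical, and normalization to the mean is assumed.

```python
import numpy as np

def low_spatial_frequencies(image, bad_pixel_mask, sigma_pix=10.0, pad=40, box=2):
    """Gaussian low-pass filter applied in Fourier space (hypothetical helper)."""
    img = np.array(image, dtype=float)
    bad = np.asarray(bad_pixel_mask, dtype=bool)
    ny, nx = img.shape

    # Replace each bad pixel by the median of the good pixels in a box around it.
    for y, x in zip(*np.nonzero(bad)):
        ys = slice(max(y - box, 0), y + box + 1)
        xs = slice(max(x - box, 0), x + box + 1)
        neighbours = img[ys, xs][~bad[ys, xs]]
        if neighbours.size:
            img[y, x] = np.median(neighbours)

    # Enlarge the image by mirroring its edges and corners.
    padded = np.pad(img, pad, mode="symmetric")

    # Gaussian filter in Fourier space. np.fft.fftfreq returns frequencies in
    # the FFT's own (quadrant-shifted) order, so no explicit shift is needed.
    fy = np.fft.fftfreq(padded.shape[0])[:, None]
    fx = np.fft.fftfreq(padded.shape[1])[None, :]
    gauss = np.exp(-2.0 * np.pi**2 * sigma_pix**2 * (fx**2 + fy**2))

    # Multiply the transform by the filter, transform back, remove the border.
    smooth = np.fft.ifft2(np.fft.fft2(padded) * gauss).real
    smooth = smooth[pad:pad + ny, pad:pad + nx]

    # Normalize, excluding bad pixel values (normalization to the mean assumed).
    return smooth / np.mean(smooth[~bad])

def combine_master_flat(dome, twilight, bad_pixel_mask, **kwargs):
    """High frequencies of the dome flat times low frequencies of the twilight flat."""
    dome_high = dome / low_spatial_frequencies(dome, bad_pixel_mask, **kwargs)
    twilight_low = low_spatial_frequencies(twilight, bad_pixel_mask, **kwargs)
    return dome_high * twilight_low
```

The mirrored border should be a few times `sigma_pix` wide; otherwise the periodic wrap-around of the discrete Fourier transform leaks signal from the opposite image edge.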



2.6.5 FringeFrame

Fringing requires a different approach to background subtraction. Fringing in a 
solid state detector chip is due to interference of incident photons with photons 
reflected in the detector chip substrate. The photons causing the strongest 
fringes are those of several sky lines, mostly apparent at the long wavelengths, 
that can vary with filter. Normally, after flat-fielding, the background can be 
expected to be flat over the entire image, and a median of the image, excluding 
5σ outliers, would in principle be sufficient to subtract the background. 

In images that suffer from fringing we have to deal with a background 
that is variable on small (≪ 1′) scales within the image, and cannot be 
distinguished from sources. The image itself can, therefore, not be used to 
determine the background. However, the information of several images can 
be combined to determine a background. This average should include enough 
observations to properly exclude contamination from sources. 

A suitable strategy to construct a fringed background image, usable for 
subtraction, thereby removing the fringe pattern, remains to be determined. 
If the fringe pattern is stable over the night, a decomposition of the night-sky 
flat into an additive and a multiplicative term is feasible. The assumption that 
the high-frequency spatial components in the night-sky flat are fringes, while 
the lowest frequency components represent gain variations, has been used with 
reasonable success. 

The procedure to create a FringeFrame starts with a minimum of 3 non-cospatial 
science images of reasonably long exposure time (e.g., greater than 30 
seconds) within a given night in a given band to achieve optimal results. Images 
are overscan-corrected, trimmed, debiased, flat-fielded and normalized, then 
stacked and a median along the Z-axis is calculated. This median is intended 
to remove any non-systematic effects (objects, cosmic rays, satellite tracks, 
etc.). The median image is then normalized to the mean taking into account 
hot and cold pixels. The value of 1.0 is subtracted from the normalized fringe 
map to obtain an average value of zero. Bad pixels are assigned a value of zero 
by multiplying by the combined hot and cold pixel maps. 
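
The construction of the fringe map described above can be sketched as follows. This is a minimal NumPy illustration, not the AWE code: the function name `make_fringe_map` is hypothetical, the input frames are assumed to be already detrended and flat-fielded, and the per-frame normalization by the median of good pixels is an assumption.

```python
import numpy as np

def make_fringe_map(frames, hot_pixel_map, cold_pixel_map):
    """Zero-mean fringe map from detrended science frames (hypothetical helper).

    The pixel maps use 1 for good pixels and 0 for bad pixels.
    """
    stack = np.asarray(frames, dtype=float)
    good_map = np.asarray(hot_pixel_map) * np.asarray(cold_pixel_map)
    good = good_map.astype(bool)

    # Normalize every frame (assumed: by its median over good pixels).
    levels = np.array([np.median(frame[good]) for frame in stack])
    stack = stack / levels[:, None, None]

    # The median along the Z-axis removes objects, cosmic rays and tracks.
    fringe = np.median(stack, axis=0)

    # Normalize to the mean of the good pixels, then subtract 1.0 so that
    # the map averages to zero.
    fringe = fringe / np.mean(fringe[good]) - 1.0

    # Bad pixels get the value zero.
    return fringe * good_map
```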

During a night the brightness of the emission lines will change, especially 
near evening and morning twilight. The result of this is that the amplitude of 
the observed fringes will change. Therefore, fringe maps should be scaled to 
fit the amplitude of the fringes in each science frame. This is calculated from 
the standard deviation in a science image, which is derived from all non-bad 
pixels that have values within a given threshold from the median background 
level. It is assumed that this standard deviation depends on the amplitude of 
the fringes. 

2.7 AstrometricParameters

Astrometric calibration is a vital, integral part of any astronomical data reduc- 
tion and analysis system. AWE performs two kinds of astrometric calibration of 
pixel data. Their results are termed local astrometry and global astrometry. The 
goal of the global astrometry is to improve on the local astrometry. Unlike all 
the previous calibrations, the resulting AstrometricParameters objects are 
each linked to a single processed science observation (a single detector chip of 
one exposure), as it is that observation that provides the source positions to 
be calibrated via the astrometric solution. 



The local astrometric solution (see Sect. 2.7.1) is derived on the basis of a 
single detector chip's information. It is obtained by minimizing the differences 
between the RA and DEC positions of sources in a single detector chip and 
their positions listed in a catalog of astrometric standards. The global 
astrometric solution (see Sect. 2.7.2) can be obtained if one has dithered (nearly 
cospatial and cotemporal) observations and local astrometric solutions for 
each detector chip. It then additionally minimizes the positional differences 
of sources appearing on more than one detector chip. This results in a higher 
accuracy of the astrometric calibration. The use of global astrometry improves 
the image quality of a coaddition of dithered observations compared to local 
astrometry. 

In AWE, astrometric solutions are derived by running LDAC (Leiden Data 
Analysis Center; documentation at 
ftp://ftp.strw.leidenuniv.nl/pub/ldac/software/pipeline.pdf) C programs on 
catalogs extracted from reduced pixel data. The C programs are wrapped in 
Python to allow interaction with the object-oriented database model employed 
by AWE. In local astrometry, all the steps in the astrometric solution 
(pre-astrometric correction, association, formal solution, etc.) are handled by 
the LDAC programs. In global astrometry, all the steps are also handled by 
LDAC except for the initial cross-correlation (called association) of sources, 
which is handled by the AWE database (via advanced queries). This offers a 
performance advantage because the data to be associated already reside in the 
database and can be used in any combination as needed. 

2.7.1 Local astrometry 

Local astrometry in AWE starts with a ReducedScienceFrame that has some 
basic astrometry, directly from the telescope or updated sometime prior to 
ingestion. In a parallel environment, the ReducedScienceFrame is run through the 
AstrometricParametersTask, a Python convenience recipe interacting with 
the database, whereby various C programs wrapped in Python solve for the 
astrometry on the catalog level. SExtractor is run to extract the initial 
catalog. After this, LDAC tools perform all subsequent operations, starting with 
pre-astrometric fitting to solve for large (approximately arcminute level) 
offsets, scaling, and rotations using any all-sky catalog for reference (e.g., 
USNO, 2MASS, etc.). This pre-astrometry is then applied to the catalog and it 
is formally associated with the reference catalog with offsets that are now on 
the order of arcseconds. During the process, only the most stellar-like and 
best quality objects, as determined by SExtractor flags (for saturation, 
incomplete objects on the edge of a detector chip, blended objects, etc.) are 
retained. The catalog is then run through the LDAC.astrom program where the 
final astrometry is determined (least-squares fit to a 2-degree polynomial) and 
a residuals catalog is created. The last step is converting the distortion 
correction to world coordinates prior to storing the solution parameters in the 
database and the residuals catalog on the dataserver. These final residuals are 
now on the level of accuracy of the reference catalog used. 

The residuals catalog output from the LDAC.astrom program contains 
residuals of the form DRA = RA_ref - RA_ldac and DDEC = DEC_ref - DEC_ldac, 
where RA_ldac and DEC_ldac are the coordinates of the extracted sources, 
corrected for all distortions by the LDAC programs, and RA_ref and DEC_ref 
are the coordinates of the reference sources from the reference catalog used. 
The residual plots created by the AstrometricParameters inspect method 
plot information directly from this residuals catalog and show what is to be 
the expected precision of the correction when the ReducedScienceFrame is 
regridded into a RegriddedFrame. After the local astrometric solution is created, 
the information can be applied to create a RegriddedFrame (see Sect. 3.5) and 
eventually a CoaddedRegriddedFrame (see Sect. 3.6). 
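
To make the fitting step concrete, the sketch below performs a generic least-squares fit of a 2-degree polynomial in the detector coordinates and returns residuals in the sense reference minus fitted, matching the DRA and DDEC convention above. It only illustrates the technique; it is not the LDAC.astrom algorithm, and the function names and the use of tangent-plane coordinates (`xi`, `eta`) are assumptions.

```python
import numpy as np

def poly2d_design(x, y, degree):
    """Design matrix with columns x**i * y**j for all i + j <= degree."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.column_stack([x**i * y**j
                            for i in range(degree + 1)
                            for j in range(degree + 1 - i)])

def fit_distortion(x, y, xi_ref, eta_ref, degree=2):
    """Fit reference positions as polynomials of the detector positions."""
    design = poly2d_design(x, y, degree)
    coeff_xi, *_ = np.linalg.lstsq(design, xi_ref, rcond=None)
    coeff_eta, *_ = np.linalg.lstsq(design, eta_ref, rcond=None)
    # Residuals: reference position minus fitted position.
    d_xi = xi_ref - design @ coeff_xi
    d_eta = eta_ref - design @ coeff_eta
    return coeff_xi, coeff_eta, d_xi, d_eta
```

For `degree=2` the coefficient order is 1, y, y², x, xy, x². Scaling the pixel coordinates to a range of order unity before fitting keeps the design matrix well conditioned.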



2.7.2 Global astrometry 



The most important concept in the global solution in AWE is that it is local. 
It is local in the sense that it uses the extra information of a set of dithered 
observations that are closely matched both temporally and spatially (e.g., 
exposures taken within one to two hours with more than 90% of each detector 
chip participating in the overlap region, respectively). The extra information 
characteristic of a closely matched dither consists of the smooth variations 
in time of the optical system distortions and the large amount of overlap 
of the detector area. Combining the distortion information with the overlap 
information allows the global solution to attain the higher precision needed 
for proper coaddition of the source frames. This local-global astrometry is the 
only method of global astrometry certified in AWE. 

Global astrometry in AWE is based on the concept of fixed focal-plane geometry. 
This means that any difference in the apparent focal plane from pointing to 
pointing is assumed to change in a linear fashion only, with higher order 
distortions remaining constant (i.e., only relative translations of the entire 
focal plane in RA and Dec are corrected for). This assumption of fixed 
focal-plane geometry adds information to the system, benefiting the astrometric 
solution. Generally, only sets of exposures taken temporally close and spatially 
close will match these criteria. These two conditions minimize differences in 
telescope flexure caused by different altitude and azimuth locations, and 
maximize the number of objects common to all exposures. 

The process of global-global astrometry is quite different. It involves 
combining those dithers from widely different observation times, using 
independent derivations of the optical system distortions, but combining all 
overlap information available from overlapping dithers. It allows for the 
discontinuity among dithers that the local-global process cannot. This type of 
global astrometry is not present in AWE at this time. 

Fig. 7 An example of improvement from the local to the global solution. Both 
panels show astrometric residuals, in arcsec, ΔRA = RA1 - RA2 and 
ΔDec = Dec1 - Dec2, where RA1 and Dec1 are the source positions from any one 
frame, and RA2 and Dec2 are the source positions of all matching sources in one 
of the other frames, same or different detector chip, that overlaps the first. 
The top panel shows the overlapping source position differences from 32 frames 
of a 4-point WFI dither regridded using the local solution (limits scaled to 
match lower panel), the bottom panel shows the same for the same frames 
regridded using the global solution. The improvement in precision is greater 
than a factor of two. 

Global astrometry in AWE starts with the GAstromSourceListTask, a Python 
convenience recipe interacting with the database that creates a special 
SourceList (see Sect. 3.7) from the source ReducedScienceFrame using the 
AstrometricParameters information created by the local solution. This is 
done in a parallel environment and only if the SourceLists do not already 
exist. Next, the GAstromTask recipe is run in a serial environment as only a 
single thread is needed. It associates the source position information from the 
associated SourceLists, residing solely in the database, using an AssociateList 
object (see Sect. 3.8). This step replaces the LDAC.associate stage in 
the local solution. After the association, LDAC.astrom is run on the associated 
data using a least-squares fit to a 3-degree polynomial (as opposed to a 2-degree 
polynomial in the local solution), and, as in the local solution, a residuals 
catalog is created. 

The residuals catalog output from the LDAC.astrom program in this case 
contains two sets of residuals, one identical to that of the local solution with 
respect to the reference catalog used (see Sect. 2.7.1), and the other with 
respect to the overlapping extracted sources. The latter residuals are of the 
form DRA = RA2 - RA1 and DDEC = DEC2 - DEC1, where RA1 and DEC1 
are the coordinates of the extracted sources from a given frame and RA2 and 
DEC2 are the coordinates of the extracted sources from another pointing, 
same or different detector chip, that overlaps the first, both corrected for all 
distortions by the LDAC.astrom program. The residual plots created by the 
GAstrometric inspect method plot both sets of residuals directly from this 
residuals catalog, both by individual detector chip and for all detector chips 
combined, and show what is to be the expected precision of the global solution 
used to combine a set of RegriddedFrames into a CoaddedRegriddedFrame. 

After the global astrometric solution is created, the information is used 
to create a new AstrometricParameters instance for each ReducedScienceFrame 
that went into the solution. The parameters and statistics for the global 
solution are computed and stored on a per frame basis and likely will not match 
those values of other frames from the same solution. As with the local solution, 
these parameters can be applied to create RegriddedFrames (see Sect. 3.5) and 
eventually a CoaddedRegriddedFrame (see Sect. 3.6), but with much greater 
precision than with the local solution only (see Fig. 7 for an example using 
WFI data). 


2.8 PhotometricParameters

The photometric pipeline in AWE is aimed at calibrating large imaging surveys 
taken with multi-detector chip wide-field imagers during many nights and 
different epochs. Instrumental characteristics specific for wide-field imagers 
need to be accounted for in a survey photometric pipeline. For example: 
- Detector chip-to-detector chip variations. Each detector chip has its own 
  small and large-scale variations in pixel gain and can have a different 
  median gain. There can also be detector chip-to-detector chip variations in 
  the non-linearity behavior of count rates or color terms of the photometric 
  calibration. 

- Illumination variation. Several wide-field imagers are known to have 
  illumination variations (e.g., MEGACAM at CFHT, WFI at ESO/MPG 2.2 m). 
  The gain variation over individual detector chips is characterized by 
  flat-fields under the assumption of an ideal flat illumination over the field 
  of view. In practice this ideal flat illumination can be affected by stray 
  and/or scattered light (sky concentration) yielding variations of up to a few 
  tenths of a magnitude in amplitude. 

- Shutter timing. The large FoV requires carefully designed shutter 
  mechanisms. Shutter timing variations might result in position dependent 
  exposure times. 

Performing a survey with such an instrument poses several challenges for 
the photometric calibration. Long-term, short-term, night-to-night or even 
intra-night variations need to be monitored to create a homogeneous 
photometric calibration across the whole survey area and survey time. It might be 
the case that a very precise photometric calibration is dependent on 
instrumental variations not captured by a single or a handful of standard star 
observations per night, e.g., telescope altitude and azimuth. To detect and 
quantify all such effects it is important to explore trends in photometric 
results as a function of many parameters. To obtain the maximum photometric 
accuracy it is required to have observations of photometric standards that 
densely cover the full FoV. 

The goal of AWE photometric calibration is to establish the photometric sys- 
tem resulting from the signal progressing through Earth's atmosphere, tele- 
scope, filter, wide-field camera and each detector chip resulting in a digital 
read-out. The photometric system is characterized in AWE in terms of a multi- 
plication of gains: 

$$ I_{obs} = g_{ff}(t, N, \lambda, (x,y)) \times g_e(t_0)\, g_e(t)\, g_{sel.e}(\lambda) \times g_{QE}(t, N, (x,y), \lambda) \times I_{ref}, \qquad (1) $$

where $I_{obs}$ is the observed countrate of a standard star in digital units and 
$I_{ref}$ its emitted physical flux, $t$ is time and $(x, y)$ position on the 
detector chip. The gain $g_{ff}(t, N, \lambda, (x,y))$ characterizes the flat 
field. The gains $g_e$ characterize the atmospheric extinction: $g_e(t_0)$ is 
the scaling at time $t_0$ of the selected atmospheric extinction curve 
$g_{sel.e}(\lambda)$ that is a function of filter $\lambda$ and $g_e(t)$ models 
the change at time $t$. The gains $g_{QE}$ characterize the overall 
instrumental quantum efficiency that includes the light losses through the 
optics and conversion from physical units of flux to countrates for detector 
chip $N$. The illumination variation is captured as a separate gain $g_{illum}$. 

By determining the gains, AWE then gives for each detector chip independently 
the photometric calibration at any time for any pixel for each filter. 
The photometric calibration objects have timestamps to indicate their validity 
range in time (see Sect. 1.1). Thus AWE holds a continuous representation 
of the photometric system of an instrument if the calibration plan of the 
instrument provides the required observations. This is another example of how 
AWE calibrates the instrument instead of a specific data set. 

The gain factors representing atmospheric extinction and instrumental 
quantum efficiency are solved in magnitude space. The involved physics for 
wide-field cameras is well-represented by the common photometric equation 
for astronomical imagers: 

$$ m_{inst} = -2.5 \log(countrate) + ZPT - k \times AM + C_0 - C_1 \times (m_{\lambda_2} - m_{\lambda_1}), \qquad (2) $$

where $m_{inst}$ is the magnitude of the object in the instrumental photometric 
system, the countrate is in ADU/s, $k$ is the atmospheric extinction coefficient, 
$AM$ is the airmass, and $C_{0,1}$ are the terms describing the corrections to go 
from the standard photometric system to the instrumental photometric system. 
$m_{\lambda_2} - m_{\lambda_1}$ is the color between filters $\lambda_2$ and 
$\lambda_1$ of the standard star as listed in the catalog of the standard 
photometric system. 

2.8.1 Atmospheric extinction 

The atmospheric extinction in magnitude space is assumed to be a linear 
function of airmass $AM$ (i.e., $k \times AM$ in Eqn. 2, which is a 
representation of $g_e$ in Eqn. 1: $g_e = 10^{-(k \times AM)/2.5}$). The task is 
to establish the atmospheric extinction coefficient $k$. The airmass is taken 
from the observational metadata. 

In AWE, the correction for the atmosphere in the photometric calibration 
can be derived in four ways. 

1. Using a pre-defined atmospheric extinction coefficient. The coefficient is 
   multiplied by the airmass (see Eqn. 2). These coefficients are stored in the 
   AWE database for each combination of instrument and filter as objects of 
   the class AtmosphericExtinctionCoefficient. Users can insert their own 
   atmospheric extinction coefficients in AWE. 

2. Using a pre-defined atmospheric extinction coefficient plus a shift. This 
   uses the coefficient just described plus a shift given by a report 
   represented by the class PhotometricExtinctionReport. This kind of 
   atmospheric correction on the photometry is represented by the class 
   AtmosphericExtinctionCurve. 

3. Using standard star field observations. This kind of correction is 
   represented by the class AtmosphericExtinction. There are two sub-options 
   here: 

   (a) Using a single standard star field observation and a given zeropoint. 
       Eqn. 2 is then used to determine an atmospheric extinction coefficient. 
       This type of atmospheric correction is represented by the 
       AtmosphericExtinctionZeropoint class. 

   (b) Using two observations of standard star fields at different airmass. By 
       equating the zeropoints in Eqn. 2 one can solve for the atmospheric 
       extinction coefficient. This correction type is represented by the class 
       AtmosphericExtinctionFrames. 
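
The two sub-options of method 3 follow directly from rearranging Eqn. 2. The sketch below is a worked illustration only: the function names are hypothetical, and the reference magnitude passed to the first function is assumed to be already expressed in the instrumental system, so that the color terms do not appear explicitly.

```python
import math

def extinction_from_zeropoint(countrate, airmass, zeropoint, m_ref_inst):
    """Option (a): one standard-field observation plus a known zeropoint.

    Rearranging m = -2.5 log10(countrate) + ZPT - k * AM for k.
    """
    return (zeropoint - 2.5 * math.log10(countrate) - m_ref_inst) / airmass

def extinction_from_two_frames(countrate_1, airmass_1, countrate_2, airmass_2):
    """Option (b): the same star at two airmasses, with equal zeropoints.

    Setting the two instances of Eqn. 2 equal removes ZPT and the magnitude.
    """
    return 2.5 * math.log10(countrate_1 / countrate_2) / (airmass_2 - airmass_1)
```

In option (b) the star's magnitude and the zeropoint cancel, so only the ratio of the two countrates and the airmass difference enter; a small airmass difference therefore amplifies the countrate errors.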

2.8.2 Color terms 

Differences in the effective throughput per wavelength of the photometric sys- 
tem of the standard system and the instrumental system can be caused, for 
example, by differences in filter transmission curves or in quantum efficien- 
cies of the detector chips. In AWE, it is assumed that these differences can 
be captured by a linear function of the standard star color in the standard 
photometric system: 

$$ m_{ref,i,inst} = m_{ref,i,std} + C_0 - C_1 \times (m_{i,\lambda_2} - m_{i,\lambda_1}), \qquad (3) $$

where $m_{ref,i,inst}$ is the magnitude of the standard star $i$ in the 
instrumental photometric system and $m_{ref,i,std}$ in the standard photometric 
system. For each combination of instrument and filter the two coefficients are 
pre-determined and stored in AWE. The PhotTransformation class represents the 
color transformation, and objects of this class contain the coefficients. The 
magnitudes of the standard star in filters $\lambda_2$, $\lambda_1$ are taken 
from the standard star catalog (a PhotRefCatalog object) stored in AWE. 

2.8.3 Zeropoints 

The flux counts and astrometry of stars in a photometric standard field are 
measured using SExtractor. The resulting catalog is associated (using the 
prephotom package in LDAC) with known standard stars listed in a reference 
catalog. Now a "raw" instrumental magnitude ($m_{raw,i,inst}$) and zeropoint 
($ZPT_{raw,i,inst}$) are computed for each observed standard star $i$: 

$$ m_{raw,i,inst} = -2.5 \log(countrate) \qquad (4) $$

$$ ZPT_{raw,i,inst} = m_{ref,i,inst} - m_{raw,i,inst} \qquad (5) $$

A clipping is applied on the set of raw zeropoints: 

$$ |ZPT_{raw,i,inst} - \mathrm{median}(ZPT_{raw,inst})| < MAX\_MAG\_DIFF, \qquad (6) $$

with MAX_MAG_DIFF set by the user. The result is stored in a photometric 
source catalog represented by the class PhotSrcCatalog. 

If at least a required minimum number of standard stars identified in the 
observation remain (the MIN_NMBR_OF_STARS parameter), the final zeropoint 
is computed. A sigma clipping with a threshold factor SIGCLIP_LEVEL 
set by the user is applied once to the raw zeropoints. The variance weighted 
mean and its uncertainty are computed from the remaining raw zeropoints. 
This mean is then corrected for the atmospheric extinction yielding the 
zeropoint ZPT. The ZPT is stored in a PhotometricParameters object. Formal 
errors are propagated from count measurements through the computation of 
zeropoint and atmospheric extinction. 
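
The zeropoint procedure of this section can be sketched as follows. This is an illustration under stated assumptions, not the AWE code: the function name is hypothetical, the default parameter values are placeholders, the single sigma-clipping pass is assumed to be centred on the median and to use the population standard deviation, and only the countrate errors are propagated.

```python
import math
import statistics

def compute_zeropoint(countrates, countrate_errs, m_ref_inst, airmass, k,
                      max_mag_diff=1.0, min_nmbr_of_stars=3, sigclip_level=3.0):
    """Return (ZPT, uncertainty), or None if too few standard stars remain."""
    # Raw instrumental magnitudes and raw zeropoints with propagated errors.
    raw = []
    for rate, rate_err, m_ref in zip(countrates, countrate_errs, m_ref_inst):
        m_raw = -2.5 * math.log10(rate)
        sigma = 2.5 / math.log(10.0) * rate_err / rate
        raw.append((m_ref - m_raw, sigma))

    # Clip on the absolute difference from the median raw zeropoint.
    median = statistics.median([z for z, _ in raw])
    kept = [(z, s) for z, s in raw if abs(z - median) < max_mag_diff]
    if len(kept) < min_nmbr_of_stars:
        return None

    # A single sigma-clipping pass (centre and scatter estimator assumed).
    median = statistics.median([z for z, _ in kept])
    scatter = statistics.pstdev([z for z, _ in kept])
    kept = [(z, s) for z, s in kept if abs(z - median) <= sigclip_level * scatter]

    # Variance-weighted mean and its uncertainty.
    weights = [1.0 / s**2 for _, s in kept]
    mean = sum(w * z for w, (z, _) in zip(weights, kept)) / sum(weights)
    uncertainty = math.sqrt(1.0 / sum(weights))

    # Correct for atmospheric extinction: from Eqn. 2, ZPT = ZPT_raw + k * AM.
    return mean + k * airmass, uncertainty
```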

AWE contains a photometric reference catalog that contains the magnitudes 
of standard stars in the Johnson-Cousins system (from Landolt and Stetson), and 
the Sloan system in 22 SA fields. By default, all entries from the 
standard star catalog are used, but one can limit this to subsets. It is also 
possible to use a custom photometric reference catalog. 



2.8.4 IlluminationCorrection 



The photometric calibration accounts for gain variations under the assumption 
of an ideal flat illumination over the field of view. In practice this ideal flat 
illumination can be affected by stray light (sky concentration) and a correction 
for this effect has to be made. 

It is assumed that the effect of the illumination variations is larger than 
detector chip-to-detector chip systematic variations. It has been verified that 
this holds for the MEGACAM and WFI instruments. The starting point is all 
detectors of a mosaic of a standard star field observation that is detrended up 
to the flat field level in a given filter. The raw zeropoint (see Sect. 2.8.3) is 
determined for each standard star. The residual between these zeropoints and 
their median value over all detector chips in the mosaic is a measure of the 
illumination variation. The residual distribution is assumed to be well-fitted 
with a two-dimensional second order polynomial (as is verified for MEGACAM 
and WFI) using a chi-square minimization. An illumination variation frame is 
created from the polynomial fit for each detector chip. Each standard star field 
frame is divided by this IlluminationCorrectionFrame and a new zeropoint 
determination is performed per detector chip. This last step corrects for any 
remaining detector chip-to-detector chip variations. 
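
The polynomial fit of the zeropoint residuals can be sketched as a weighted linear least-squares problem, which is equivalent to a chi-square minimization for independent Gaussian errors. The function names are hypothetical, and a polynomial of total degree two in normalized mosaic coordinates is assumed.

```python
import numpy as np

def _design(x, y):
    """Terms of a two-dimensional second order polynomial: 1, x, y, x², xy, y²."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

def fit_illumination(x, y, zp_resid, zp_err):
    """Chi-square fit of zeropoint residuals (mag) versus mosaic position."""
    inv_err = 1.0 / np.asarray(zp_err, dtype=float)
    # Dividing each row by its error turns ordinary least squares into
    # a chi-square minimization.
    coeffs, *_ = np.linalg.lstsq(_design(x, y) * inv_err[:, None],
                                 np.asarray(zp_resid, dtype=float) * inv_err,
                                 rcond=None)
    return coeffs

def evaluate_illumination(coeffs, x, y):
    """Evaluate the fitted residual surface (mag) at the given positions."""
    return _design(x, y) @ coeffs
```

Evaluating the surface on the pixel grid of each detector chip gives the illumination variation in magnitudes; its conversion to a multiplicative frame depends on a sign convention that is not spelled out here and is therefore left out of the sketch.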

The resulting illumination correction is applied to ReducedScienceFrames 
in the following manner: the background is removed from the science frames 
and the remaining pixels associated with sources (both calculated by 
SExtractor) are multiplied by the IlluminationCorrectionFrame. The background 
is added back and the zeropoints from the standard star field with illumination 
correction are applied. 

In wide-field instruments (e.g., OmegaCAM), the illumination variation 
pattern across the large detector block can vary with time, telescope position, 
etc. In these cases, an IlluminationCorrectionFrame may fail to properly 
characterize the illumination variation and require a different approach. One 
such approach involves compensating for only the pixel-to-pixel variations in 
the flat-fielding as alluded to in Sect. 2.6.4. A MasterFlatFrame constructed 
from only the high spatial frequencies of a DomeFlatFrame can be used to 
eliminate the pixel-to-pixel sensitivity variations without adding any illumina- 
tion variation from the low spatial frequency (large scale) contributions. Any 
remaining illumination variation above the background, if it exists, can then 
be corrected for appropriately, either as described above or via robust sky 
subtraction techniques (e.g., with SExtractor). 
Fig. 8 Schematic flow of the image pipeline following the coloring in Fig. 1. The recipes, also 
called Tasks, used to produce various ProcessTargets are indicated in each box (with their 
data product in parentheses) and described in the various sections. The arrows connecting 
them indicate the direction of processing. Note that the global (multi-chip) astrometry 
branch is optional and supplementary to the local (single-chip) astrometry. Also note that, 
while AssociateList is the formal data product of GAstrom, new AstrometricParameters 
objects are created in the process as well. 



Class                process_param                value    units

ReducedScienceFrame  overscan_correction          6
                     fringe_threshold_low         1.5
                     fringe_threshold_high        5.0
                     image_threshold              5.0
SaturatedPixelMap    threshold_low                50.0     ADU
                     threshold_high               50000.0  ADU
SatelliteMap         detection_threshold          5.0
                     hough_threshold              1000.0
RegriddedFrame       background_subtraction_type
SourceList           htm_depth                    25
AssociateList        search_distance              5.0      arcsec
                     single_out_closest_pairs     1
                     sextractor_flag_mask

Table 3 Processing parameters and their default values. These values are representative 
of the typical value for any instrument. Some instruments may have values that differ 
from these based on experience with that instrument. See the document page linked from 
the class name or appropriate links on http://doc.astro-wise.org/astro.main.html for more 
details. 



3 Image Pipeline: combining the pixels 

As mentioned earlier, one advantage of the AWE is its parallel processing capa- 
bility. Much of the processing is done in a parallel environment, one detector 
chip per CPU node. There are two places in the image pipeline, however, 
where the information of individual detector chips must be combined: the as- 
trometric solution may be derived for all detector chips simultaneously (global 
astrometry), and science images may be coadded into larger mosaics and/or 
deeper images. See Fig. 8 for an overview. 

Many ProcessTargets have configurable processing parameters to control 
how they are processed. Table 3 gives an overview of these process_params 
for the image pipeline. 

3.1 ReducedScienceFrame



The most basic outcome of the image pipeline is the ReducedScienceFrame. 
Conventional de-trending steps are performed when making this frame: 

1. overscan correction and trimming 

2. subtraction of the BiasFrame 

3. division by the MasterFlatFrame 

4. scaling and subtraction of a FringeFrame if indicated 

5. multiplication by an IlluminationCorrectionFrame if indicated 

6. creation of the individual weight image 

7. computation of the image statistics 

Please note that: 

- the overscan correction can be a null correction (i.e., no modification of 
  the pixel values) 

- in the illumination correction step (i.e., application of a photometric flat 
  field), a SExtractor-created background is removed before the multiplication 
  and added back afterwards; the correction is applied only when requested and 
  if a suitable IlluminationCorrectionFrame exists 
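
Steps 1 to 5 above can be summarized in a short NumPy sketch. It is schematic and rests on several assumptions: the function and parameter names are hypothetical, the overscan correction is reduced to subtracting a single level, the fringe scale is supplied by the caller, and a median stands in for the SExtractor background when none is given.

```python
import numpy as np

def reduce_science_frame(raw, overscan_level, trim, bias, master_flat,
                         fringe_map=None, fringe_scale=0.0,
                         illum_corr=None, background=None):
    """Detrend one detector chip (steps 1 to 5); `trim` is a (rows, cols) slice pair."""
    rows, cols = trim
    # 1. Overscan correction (a level of 0.0 is a null correction) and trimming.
    frame = np.asarray(raw, dtype=float)[rows, cols] - overscan_level
    # 2. Subtraction of the bias frame.
    frame = frame - bias
    # 3. Division by the master flat.
    frame = frame / master_flat
    # 4. Scaling and subtraction of the fringe map, if indicated.
    if fringe_map is not None:
        frame = frame - fringe_scale * fringe_map
    # 5. Illumination correction, if indicated: remove the background,
    #    multiply, then add the background back.
    if illum_corr is not None:
        if background is None:
            background = np.median(frame)   # stand-in for a SExtractor background
        frame = (frame - background) * illum_corr + background
    return frame
```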



3.2 WeightFrame

In addition to the effects of hot and cold pixels, individual images may be 
contaminated by saturated pixels, cosmic ray events, and satellite tracks. For 
purposes of subsequent analysis and image combination, affected pixels unique 
to each image need to be assigned a weight of zero in that image's weight map. 

Since the variance is inversely proportional to the gain, which is 
proportional to the flat field, the weight is given by: 

$$W_{ij} = G_{ij} \, P_{\mathrm{hot}} \, P_{\mathrm{cold}} \, P_{\mathrm{saturated}} \, P_{\mathrm{cosmic}} \, P_{\mathrm{satellite}},$$ 

where $W_{ij}$ is the weight of a given pixel, $G_{ij}$ is the gain of a given 
pixel (taken from the flat field), and the rest of the members are binary maps 
where good pixels have a value of 1 and bad pixels have a value of 0. These 
maps are, respectively, a HotPixelMap, a ColdPixelMap, a SaturatedPixelMap, a 
CosmicMap, and a SatelliteMap, the last three being calculated directly from 
the ReducedScienceFrame after detrending. 
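
The weight formula is a plain element-wise product, which the following NumPy sketch makes explicit. The function name `weight_frame` and the argument order are illustrative assumptions, not part of AWE.

```python
import numpy as np


def weight_frame(gain, hot, cold, saturated, cosmic, satellite):
    """Multiply the gain map by five binary maps (1 = good, 0 = bad)."""
    weight = np.array(gain, dtype=float)  # copy so the input is not modified
    for binary_map in (hot, cold, saturated, cosmic, satellite):
        weight *= np.asarray(binary_map, dtype=float)
    return weight
```

A pixel flagged in any one of the five maps ends up with zero weight, whatever its gain.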



3.2.1 SaturatedPixelMap 

Saturated pixels are pixels whose counts exceed a certain threshold. In addi- 
tion, saturation of a pixel may lead to dead neighbouring pixels, whose counts 
lie below a lower threshold. These upper and lower thresholds are defined and 
stored in the object. 
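
A minimal sketch of the two-threshold test, assuming the image is a NumPy array. The name `saturated_pixel_map` is hypothetical, and as a simplification every pixel below the lower threshold is flagged, not only those adjacent to a saturated pixel.

```python
import numpy as np


def saturated_pixel_map(image, upper, lower):
    """Return a binary map: 0 where counts are above `upper` or below `lower`."""
    image = np.asarray(image, dtype=float)
    bad = (image > upper) | (image < lower)
    return np.where(bad, 0, 1).astype(np.uint8)
```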

3.2.2 CosmicMap 

Two programs may be used to detect cosmic ray events: 

1. SExtractor can be run with a special filter that is only sensitive to cosmic- 
ray-like signal. This requires a 'retina' filter, which is a neural network 
that uses the relative signal in neighboring pixels to decide if a pixel is 
a cosmic. A retina filter, called 'cosmic.ret', is provided. Run SExtractor 
with FILTER_NAME=cosmic.ret to run SExtractor in cosmic ray detection 
mode. This results in a so-called segmentation map, recording the pixels 
affected by cosmic ray events. This segmentation map can be used to assign a 
weight of zero to these pixels. 

2. CosmicFITS is designed as a stand-alone program to detect cosmic ray 
events. 

In the AWE, the SExtractor method is the preferred cosmic ray event detection 
method. 
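
Turning a segmentation map into a binary cosmic-ray map can be sketched as follows. The sketch assumes the segmentation map has already been read into an integer array in which background pixels are 0 and detected pixels carry a positive label (the convention of SExtractor's SEGMENTATION check-image); the function name `cosmic_map` is hypothetical.

```python
import numpy as np


def cosmic_map(segmentation):
    """Return 1 for unaffected pixels and 0 for pixels flagged as cosmic rays."""
    segmentation = np.asarray(segmentation)
    return (segmentation == 0).astype(np.uint8)
```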



3.2.3 SatelliteMap 



Linear features can be detected using a Hough transform algorithm, which is 
used to find satellite tracks. See Hough (1959); Duda and Hart (1972) for more 
information about the Hough transform. 

A point $(x, y)$ defines a curve in Hough space $(\rho, \theta)$, where: 

$$\rho = x \cos\theta + y \sin\theta,$$ 

corresponding to lines with slopes $0 < \theta < \pi$, passing at a distance 
$\rho$ from the origin. This means that the curves of different points lying 
on a straight line in image space all pass through a single point 
$(\rho, \theta)$ in Hough space. 

The algorithm then creates a Hough image from an input image, by adding 
a Hough curve for each input pixel which lies above a given threshold. This 
Hough image (effectively a histogram of pixels corresponding to possible lines) 
is clipped, and transformed back into a pixelmap, masking lines with too many 
contributing pixels. 
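
The procedure just described can be sketched in NumPy as below. This is an illustrative implementation under stated assumptions, not the AWE code: the function name `satellite_map`, the one-pixel bins in the distance coordinate, the one-degree angle sampling, and the half-pixel line width used when transforming back are all choices made for the example.

```python
import numpy as np


def satellite_map(image, threshold, min_votes, n_theta=180):
    """Return a binary map (1 = good, 0 = on a detected line)."""
    image = np.asarray(image, dtype=float)
    ny, nx = image.shape
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    rho_max = int(np.ceil(np.hypot(nx, ny)))
    # Hough image: rows index rho (offset so negative values fit),
    # columns index theta.
    hough = np.zeros((2 * rho_max + 1, n_theta), dtype=int)
    theta_idx = np.arange(n_theta)
    ys, xs = np.nonzero(image > threshold)
    for x, y in zip(xs, ys):
        # Add the Hough curve of this pixel to the accumulator.
        rho = np.round(x * cos_t + y * sin_t).astype(int) + rho_max
        hough[rho, theta_idx] += 1
    # Clip the Hough image and transform the surviving bins back to pixels.
    good = np.ones(image.shape, dtype=np.uint8)
    yy, xx = np.mgrid[0:ny, 0:nx]
    for r_idx, t_idx in zip(*np.nonzero(hough >= min_votes)):
        rho = r_idx - rho_max
        on_line = np.abs(xx * cos_t[t_idx] + yy * sin_t[t_idx] - rho) <= 0.5
        good[on_line] = 0
    return good
```

An isolated bright pixel contributes only one vote to each bin it touches, so it is left unmasked, while a long track accumulates many votes in a few bins.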



3.3 AstrometricParameters 

The parameters from the astrometric solution are used during the regridding 
process and their creation has already been discussed in Sect. 2.7. 



3.4 PhotometricParameters 

The parameters from the photometric solution are used during the coaddition 
process and their creation has already been discussed in Sect. 2.8. 

3.5 RegriddedFrame 

Regridding and co-adding are done using the SWarp program. Before images 
are co-added, they are resampled to a predefined pixel grid (see Sect. 5). 
By co-adding onto a simple coordinate system, characterized by the 
projection (Tangential, Conic-Equal-Area), reference coordinates, reference 
pixel, and pixel scale, the distortions recorded by the astrometric solution 
are removed from the images. To this end a set of projection centers is 
defined, at 1 degree separation and with a pixel scale of 0.2 arcsec. A 
ReducedScienceFrame resampled to this grid is called a RegriddedFrame. The 
background of the image can be calculated and subtracted at this time, if 
desired. 
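
To make the role of the projection center and pixel scale concrete, the sketch below maps a sky position onto a tangential (gnomonic) grid using the standard spherical formulae. It does not reproduce SWarp: the function name `tan_project`, the convention that x increases toward decreasing right ascension, and the return value (pixel offsets from the reference pixel) are assumptions for illustration.

```python
import math


def tan_project(ra_deg, dec_deg, ra0_deg, dec0_deg, pixel_scale_arcsec=0.2):
    """Return (dx, dy) in pixels relative to the projection center."""
    ra, dec, ra0, dec0 = map(math.radians, (ra_deg, dec_deg, ra0_deg, dec0_deg))
    # Cosine of the angular distance between the point and the center.
    cos_c = (math.sin(dec0) * math.sin(dec)
             + math.cos(dec0) * math.cos(dec) * math.cos(ra - ra0))
    # Standard (tangent-plane) coordinates in radians.
    xi = math.cos(dec) * math.sin(ra - ra0) / cos_c
    eta = (math.cos(dec0) * math.sin(dec)
           - math.sin(dec0) * math.cos(dec) * math.cos(ra - ra0)) / cos_c
    scale = math.radians(pixel_scale_arcsec / 3600.0)
    # Assumed orientation: east to the left, north up.
    return -xi / scale, eta / scale
```

With the 0.2 arcsec pixel scale quoted above, a source 1 arcsec north of a projection center lands about 5 pixels above the reference pixel.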



3.6 CoaddedRegriddedFrame 

After the RegriddedFrames are made, it is only a matter of applying the 
photometry of each frame and stacking the result. This process creates a 
CoaddedRegriddedFrame. 

One point of great importance in considering the coadded data is its pixel 
units. The units are fluxes relative to the flux corresponding to magnitude = 0. 
In other words, the magnitude $m$ corresponding to a pixel value $f_0$ is: 

$$m = -2.5 \log_{10} f_0 \qquad (7)$$ 



The value $f_{out}$ of a pixel in the CoaddedRegriddedFrame is computed from 
all overlapping pixels $i$ in the input RegriddedFrames according to this 
formula: 

$$f_{out} = \frac{\sum_i w_i \, \mathrm{FLXSCALE}_i \, f_i}{\sum_i w_i}, \qquad (8)$$ 

where $f_i$ is the pixel value in the RegriddedFrame, $\mathrm{FLXSCALE}_i$ is 
calculated from the zeropoint, and $w_i = \mathrm{weight}_i / \mathrm{FLXSCALE}_i^2$, 
where $\mathrm{weight}_i$ is the value of the pixel in the input weight image. A 
WeightFrame is created as well. The value $W_{out}$ of the pixel in the weight 
frame for the coadd is: 

$$W_{out} = \sum_i w_i \qquad (9)$$ 
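
Equations (7) to (9) can be written out as a short NumPy sketch. The text only states that the flux scale is calculated from the zeropoint; the relation FLXSCALE = 10^(-0.4 zeropoint) used below is an assumption chosen so that the output is in units of the magnitude = 0 flux, and the function names `coadd` and `magnitude` are hypothetical.

```python
import numpy as np


def coadd(frames, weights, zeropoints):
    """Weighted coaddition of n aligned frames with shape (n, ny, nx)."""
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Assumed relation between zeropoint and flux scale (see lead-in).
    flxscale = 10.0 ** (-0.4 * np.asarray(zeropoints, dtype=float))
    flxscale = flxscale[:, None, None]
    w = weights / flxscale**2                       # w_i
    w_out = w.sum(axis=0)                           # equation (9)
    numerator = (w * flxscale * frames).sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        # Equation (8); pixels without any valid input are set to zero.
        f_out = np.where(w_out > 0, numerator / w_out, 0.0)
    return f_out, w_out


def magnitude(f0):
    """Equation (7): magnitude for a coadded pixel value f0."""
    return -2.5 * np.log10(f0)
```

For two frames with the same zeropoint and equal weights, the coadded pixel is the plain mean of the two scaled inputs, and `magnitude` returns the zeropoint minus 2.5 log10 of the mean count.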



3.7 SourceList 

In AWE, source information from processed frames can be stored in the database 
in the form of SourceLists. These are simply a transcription of SExtractor- 
derived catalog values (position, ellipticity, brightness, etc.) into the 
database. Normally, the catalog is derived from a processed frame existing in 
the system, but this is not a requirement. Arbitrary SExtractor catalogs 
meeting minimum content criteria can be ingested as well. This is how large 
survey results and reference catalogs are brought into the system. 

SWarp: http://astromatic.iap.fr/software/swarp/ 

These SourceLists can be used for a variety of purposes such as astrometric 
and photometric correction, but are normally an end product of the 
image pipeline storing key quantities about the sources in question for further 
analysis. Multiple SourceLists can be combined into an AssociateList and 
later into another SourceList via the CombinedList machinery. 



3.8 AssociateList 



Multiple SourceLists can be spatially combined (via RA and DEC values) 
and stored in the database via the AssociateList class. The association is 
done in the following way: 

1. The area of overlap of the two SourceLists is calculated. If there is no 
overlap, no association is done. 

2. The sources in one SourceList are paired with sources in the other if they 
are within a certain association radius. The default radius is 5″. The pairs 
get a unique associate ID (AID) and are stored in the AssociateList. A 
filter is used to select only the closest pairs. 

3. Finally, the sources that are not paired with sources in the other list and 
are inside the overlapping area of the two SourceLists are stored in the 
AssociateList as singles. They too get a unique AID. 

The type of association is very important. One of three types is used: 
chain, master, or matched. In a chain association, each subsequent 
SourceList is matched to the previous SourceList to find pairs; in a master 
association, they are always matched with the first SourceList; and in a 
matched association, all SourceLists are matched with all other SourceLists. 
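
The pairing logic and the three association types can be sketched in plain Python. This is only an illustration under simplifying assumptions: the two lists are taken to overlap completely (so step 1 is skipped), sources are given as (RA, DEC) tuples in degrees, the closest-pair filter is a greedy pass over pairs sorted by separation, and the names `separation_arcsec`, `associate`, and `pairs_to_match` are hypothetical.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation of two positions (degrees in, arcsec out)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0


def associate(list_a, list_b, radius_arcsec=5.0):
    """Return (AID, index_a, index_b) tuples; None marks a single."""
    candidates = sorted(
        (separation_arcsec(*a, *b), i, j)
        for i, a in enumerate(list_a)
        for j, b in enumerate(list_b)
    )
    used_a, used_b, result = set(), set(), []
    for sep, i, j in candidates:
        if sep > radius_arcsec:
            break  # all remaining candidates are farther apart
        if i in used_a or j in used_b:
            continue  # keep only the closest pair for each source
        used_a.add(i)
        used_b.add(j)
        result.append((len(result), i, j))
    # Unpaired sources are stored as singles with their own AID.
    for i in range(len(list_a)):
        if i not in used_a:
            result.append((len(result), i, None))
    for j in range(len(list_b)):
        if j not in used_b:
            result.append((len(result), None, j))
    return result


def pairs_to_match(n_lists, mode):
    """Which list indices are matched to each other for each association type."""
    if mode == "chain":
        return [(i - 1, i) for i in range(1, n_lists)]
    if mode == "master":
        return [(0, i) for i in range(1, n_lists)]
    if mode == "matched":
        return [(i, j) for i in range(n_lists) for j in range(i + 1, n_lists)]
    raise ValueError(f"unknown association type: {mode}")
```

The brute-force pairing is quadratic in the number of sources, which is acceptable for a sketch but not for survey-sized lists.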



4 Summary 

The development and implementation of the Astro-WISE optical pipeline has 
been described. This pipeline uses the Astro-WISE Environment: an informa- 
tion system designed to integrate hardware, software and human resources, 
data processing, and quality control in a coherent system that provides an 
unparalleled environment for processing astronomical data at any level, be it 
an individual user or a large survey team spread over many institutes and/or 
countries. 

The Astro-WISE Environment is built around an Object-Oriented Programming 
(OOP) model using Python, where each data product is represented by the 
instantiation of a particular type of object. The processability and quality of 
these data objects (ProcessTargets) is moderated by built-in attributes and 
methods that know, for each individual type of object or OOP class, how to 
process or qualify itself. All progenitor and derived data products are trans- 
parently linked via the database, providing an uninterrupted path between 
completely raw and fully processed data. 

This data lineage and provenance allows for a type of processing whereby 
the pipeline used for a given set of data is created on-the-fly for that 
particular set of data, where the Unix make metaphor is employed to chain 
backward through the data, processing only what needs to be processed (target 
processing). This allows unparalleled efficiency and data transparency for 
reprocessing the data when necessary, as the raw data is always available when 
newer techniques become available. 
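
The make metaphor can be illustrated with a toy dependency chain. The sketch below is not the AWE API: the class `Target`, its `make` and `invalidate` methods, and the version bookkeeping are hypothetical and only show how chaining backward rebuilds what is out of date and nothing else.

```python
class Target:
    """Toy data product that knows its dependencies and how to remake itself."""

    def __init__(self, name, dependencies=()):
        self.name = name
        self.dependencies = list(dependencies)
        self.version = 0        # incremented every time the target is made
        self.built_from = None  # dependency versions used for the last build

    def invalidate(self):
        """Mark the target as needing reprocessing."""
        self.built_from = None

    def make(self, log):
        # Chain backward: bring every dependency up to date first.
        for dep in self.dependencies:
            dep.make(log)
        current = [dep.version for dep in self.dependencies]
        # Process only if never made, invalidated, or a dependency changed.
        if self.built_from != current:
            self.version += 1
            self.built_from = current
            log.append(self.name)
```

For example, after a full build, invalidating only the flat field causes the flat and everything derived from it to be remade, while the bias and raw frame are left alone.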

Calibration of data follows the usual routes, but has been optimized for 
processing of OmegaCAM calibration data meant for detrending survey data. 
In this process, data is processed and reprocessed as more and more knowl- 
edge of the instrument system (from the optics through detector chain) is 
acquired. This effectively calibrates the instrument, leaving the data simply to 
be processed without the need for users to find or qualify their own calibrations. 
Various attributes of calibration objects (validity, quality, valid time ranges) 
transparently determine which calibrations are best to be used for any data. 
Processing parameters are set and can be reset as desired. These parameters 
are retained as part of the calibration object and guarantee that a given ob- 
ject can be reprocessed to obtain the same result or be tweaked to improve 
the result. The processing of science data is governed by the same validity, 
quality, valid time range, and processing parameter mechanism that is used 
for calibration data. 

The calibration pipeline starts with a ReadNoise object created from 
RawBiasFrames that is used to determine a clipping limit for BiasFrame 
creation. A GainLinearity object can be processed from a special set of 
RawDomeFlatFrames taken for the purpose. From this result, both the gain (in 
e⁻/ADU) and the detector linearity can be determined. A master BiasFrame is 
created from a set of RawBiasFrames to remove 2-dimensional additive structure 
in detectors. The DarkCurrent is measured for quality control of the detectors, 
but is not applied to the pixels. Bad pixels in a given detector can be found 
from the BiasFrame and a flat field image. These are termed HotPixelMap 
and ColdPixelMap, respectively. 

Flat field creation in Astro-WISE can be very simple or very complex. On 
the simple side, a single set of RawDomeFlatFrames or RawTwilightFlatFrames 
can be combined with outlier rejection and normalized to the median. 
On the complex side, high spatial frequencies can be taken from the 
DomeFlatFrame and the low spatial frequencies from the TwilightFlatFrame. In 
addition, a NightSkyFlatFrame can be added to improve this result. For an 
additional refinement to the flat field correction for redder filters, a 
FringeFrame can be created. 

Astrometric calibration starts with extraction of sources from individual 
ReducedScienceFrames. The source positions are matched to those in an 
astrometric reference catalog (e.g., USNO-A2.0) and all the positional 
differences minimized with the LDAC programs. This local solution can then be 
further refined by adding overlap information from a dither to form a global 
astrometric solution. Astrometric solutions are always stored for each 
ReducedScienceFrame individually. Photometric calibration also starts with 
source extraction (as a PhotSrcCatalog) and positional association. Then, the 
magnitudes of the associated sources are compared to those in a photometric 
reference catalog (e.g., Landolt) and the mean of the kappa-sigma-clipped 
values results in a zeropoint for a given detector for the night in question. 
The extinction can be derived from multiple such measurements, the results of 
both being stored in a PhotometricParameters object. As an optional refinement 
to the photometric zeropoint, a photometric super flat can be constructed by 
fitting magnitude differences as a function of radius across the whole detector 
block. The result of this is stored in an IlluminationCorrectionFrame object. 
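
The kappa-sigma-clipped mean used for the zeropoint can be sketched as follows. The sketch assumes the zeropoint is the mean of reference minus instrumental magnitude for the associated sources and ignores extinction; the function name `clipped_zeropoint` and the default clipping parameters are hypothetical.

```python
import numpy as np


def clipped_zeropoint(reference_mags, instrumental_mags, kappa=3.0, max_iter=10):
    """Return (zeropoint, number of sources kept) after kappa-sigma clipping."""
    diff = (np.asarray(reference_mags, dtype=float)
            - np.asarray(instrumental_mags, dtype=float))
    keep = np.ones(diff.shape, dtype=bool)
    for _ in range(max_iter):
        mean, sigma = diff[keep].mean(), diff[keep].std()
        if sigma == 0:
            break  # all remaining values are identical
        new_keep = np.abs(diff - mean) <= kappa * sigma
        if np.array_equal(new_keep, keep):
            break  # converged: no further sources are rejected
        keep = new_keep
    return float(diff[keep].mean()), int(keep.sum())
```

With only a handful of sources a single outlier inflates the standard deviation considerably, so a smaller `kappa` may be needed before it is rejected.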

The image pipeline takes all the calibrations from BiasFrame through 
MasterFlatFrame to transform a RawScienceFrame into a ReducedScienceFrame. 
This includes trimming the image after applying the overscan correction, 
subtracting the BiasFrame, dividing by the MasterFlatFrame, and 
applying the FringeFrame and IlluminationCorrectionFrame if necessary. 
The WeightFrame is constructed by taking the HotPixelMap and ColdPixelMap 
and combining them with a SaturatedPixelMap, a SatelliteMap, a CosmicMap, 
and optionally an IlluminationCorrectionFrame. These are all applied to the 
MasterFlatFrame to create the final WeightFrame. Next, the 
AstrometricParameters is applied to the ReducedScienceFrame in creating 
the RegriddedFrame, and the PhotometricParameters is applied to multiple 
RegriddedFrames to form a CoaddedRegriddedFrame. Lastly, the sources 
from one CoaddedRegriddedFrame can be extracted into a SourceList and 
associated with other SourceLists to form an AssociateList object. This 
last is the final output of the image pipeline and can combine information 
from multiple filters on the same part of the sky into one data product. 

Using AWE, the KIDS survey team has begun processing each week's worth 
of data taken at the VST (more than half a terabyte) in a single night. The 
part of the data that requires it (bad quality or validity) is reprocessed nightly 
as necessary to gain the required insight into the different aspects of the cali- 
bration process: detrending calibrations, astrometric calibrations, and photo- 
metric calibrations. 

The Astro-WISE Environment is a unique multi-purpose pipeline for astro- 
nomical surveys. All required tools (ingestion, processing, quality control, and 
publishing) are integrated in an intuitive and transparent way. It has already 
been used to process archive WFI@2.2m, MegaCam@CFHT (CFHTLS), and 
VIRCAM@VISTA data in pseudo-survey mode in preparation for its main task: 
processing KIDS, Vesuvio, OmegaWhite, and OmegaTrans survey data from 
the newly commissioned OmegaCAM@VST. 



5 Appendix: skygrid of projection centers 

Tables 4 & 5 describe a grid on the sky for projection and co-addition purposes 
in a condensed format. The grid contains 95 strips as a function of decreasing 
declination (0° > δ > −90°). For each strip the size in degrees and the number 
of 1° × 1° fields per strip are given. The last column contains the overlap 
between fields in %. By mirroring the grid along the equator one obtains a grid 
for the northern hemisphere. The combination of the grids for both hemispheres 
is a grid for the entire sky. 

Fields per strip   Overlap (%) 
       -               4.4 
     247               4.2 
     242               4.1 
     237               4.0 
     232               4.0 
     227               4.0 
     222               4.0 
     217               4.0 
     212               4.1 
     207               4.1 
     202               4.3 
     197               4.4 
     192               4.7 
     187               4.9 
     182               5.2 
     176               4.9 
     170               4.7 
     164               4.5 
     158               4.3 
     152               4.1 
     146               3.9 
     140               3.7 
     134               3.6 
     128               3.4 
     122               3.3 
     116               3.2 
     110               3.1 
     104               3.1 
      98               3.0 
      92               3.0 
      86               3.0 
      80               3.1 
      74               3.2 
      68               3.3 
      62               (illegible) 
      56               3.8 
      50               4.2 
      44               4.7 
      38               5.5 
      32               6.5 
      26               8.1 
      19               5.3 
      13               8.1 
       7              16.4 
       1               - 